Constructing a reliable Web graph with information on browsing behavior
Abstract
Page quality estimation is one of the greatest challenges for Web search engines. Hyperlink analysis algorithms such as PageRank and TrustRank are usually adopted for this task. However, low-quality, unreliable and even spam data in the Web hyperlink graph make it increasingly difficult to estimate page quality effectively. By analyzing large-scale user browsing behavior logs, we found that a more reliable Web graph can be constructed by incorporating browsing behavior information. The experimental results show that hyperlink graphs constructed with the proposed methods are much smaller than the original graph. In addition, algorithms based on the proposed “surfing with prior knowledge” model obtain better estimation results with these graphs for both high-quality page and spam page identification tasks. Hyperlink graphs constructed with the proposed methods evaluate Web page quality more precisely and with less computational effort.
HIGHLIGHTS
1. With user browsing behavior information, it is possible to improve quality estimation for commercial search engines.
2. Three different kinds of Web graphs are proposed, each combining the original hyperlink information with user browsing behavior information.
3. Differences between the constructed graphs and the original Web graph show that the constructed graphs provide more reliable information and can be adopted for practical quality estimation tasks.
4. The incorporation of user browsing information is more important than the selection of link analysis algorithms for the task of quality estimation.
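To make the general idea concrete, the following is a minimal sketch, not the paper's actual graph construction. It assumes one hypothetical filtering rule (keep a hyperlink only if users visited both of its endpoints) and then runs a plain power-iteration PageRank on the reduced graph. The function names `filter_graph` and `pagerank`, the toy data and the damping value 0.85 are all illustrative assumptions; the abstract does not specify the three graph constructions or the “surfing with prior knowledge” model in enough detail to reproduce them here.

```python
from collections import defaultdict


def filter_graph(edges, visited_pages):
    """Hypothetical rule: keep a hyperlink only if both endpoints appear in the browsing log."""
    return [(s, t) for s, t in edges if s in visited_pages and t in visited_pages]


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    if not nodes:
        return {}
    out_links = defaultdict(list)
    for s, t in edges:
        out_links[s].append(t)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for s in nodes:
            targets = out_links[s]
            for t in targets:
                new_rank[t] += damping * rank[s] / len(targets)
        rank = new_rank
    return rank


# Toy data (assumed): "spam" is linked in the hyperlink graph but never visited by users.
hyperlinks = [("a", "b"), ("b", "c"), ("c", "a"),
              ("spam", "a"), ("spam", "b"), ("a", "spam")]
visited = {"a", "b", "c"}

filtered = filter_graph(hyperlinks, visited)
scores = pagerank(filtered)
```

In this toy case the filtered graph has half the edges of the original and no longer contains the unvisited page, which mirrors the abstract's claim that the constructed graphs are smaller; whether ranking quality improves on real data depends on the actual construction and logs.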
Related resources
User Browsing Graph: Structure, Evolution and Application
This paper focuses on ‘user browsing graph’ which is constructed with users’ click-through behavior modeled with Web access logs. User browsing graph has recently been adopted to improve Web search performance and the initial study shows it is more reliable than hyperlink graph for inferring page importance. However, structure and evolution of the user browsing graph haven’t been fully studied ...
Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining
Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, was regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...
Mining Web navigation patterns with a path traversal graph
With the expansion of e-commerce and mobile-based commerce, the role of the web user on the World Wide Web has become pivotal enough to warrant studies to further understand the user’s intent, navigation patterns on websites and usage needs. Using web logs on the servers hosting websites, site owners, and in turn companies, can extract information to better understand and predict user’s needs, tailoring...
Information seeking on the Web by women in IT professions
The paper develops a behavioral model of Web information seeking that identifies four complementary modes of information seeking: undirected viewing, conditioned viewing, informal search, and formal search. In each mode of viewing or searching, users would adopt distinctive patterns of browser moves: starting, chaining, browsing, differentiating, monitoring, and extracting. The model is applied...
Designing and implementing a 3D indoor navigation web application
In recent years, the need has arisen for indoor navigation systems to guide people during natural hazards and fires, because human settlements have become increasingly complex. This research paper aims to design and implement a visual indoor navigation web application. The designed system processes the CityGML data model automatically and then extracts semantic, topologic and geometric...
Journal title:
- Decision Support Systems
Volume 54
Publication date: 2012